
All Questions

0 votes
0 answers
68 views

Convert sciBERT to GGUF

I want to use the SciBERT weights in Ollama. Ollama accepts the GGUF format, whereas SciBERT is distributed in another format. I downloaded SciBERT from this Hugging Face link. I tried to convert it with llama.cpp ...
Igor Popov
0 votes
1 answer
288 views

Can transformer models be used to convert code from one programming language to another?

There was a question like this in 2019. I hope things have changed since then. Concretely, I am looking for a way to train a transformer model to convert code from SAS to Python. I guess the method ...
Vladimir
1 vote
0 answers
57 views

Can I reduce computation by only predicting response tokens in a transformer and still get the same gradients?

I have been looking at the source code of the Stanford Alpaca model and I believe that during inference, the whole instruction + response data is fed into the model normally. Then the instruction part ...
Tianchen Zheng
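One common way to restrict the loss to response tokens is to feed the full instruction + response sequence as input but set the labels at the instruction positions to an ignore value. The sketch below only illustrates that idea and is not taken from the Alpaca source: the token IDs are made up, the helper names `build_labels` and `positions_with_loss` are hypothetical, and `-100` is assumed because it is the default `ignore_index` of PyTorch's cross-entropy loss. The next-token shift between inputs and labels is left out for brevity.

```python
# Assumed ignore value; -100 matches the default ignore_index of
# PyTorch's cross-entropy loss.
IGNORE_INDEX = -100


def build_labels(instruction_ids, response_ids):
    """Concatenate instruction and response; mask instruction positions in the labels."""
    input_ids = list(instruction_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(instruction_ids) + list(response_ids)
    return input_ids, labels


def positions_with_loss(labels):
    """Return the indices whose labels would contribute to the loss."""
    return [i for i, token in enumerate(labels) if token != IGNORE_INDEX]


# Made-up token IDs: three instruction tokens followed by two response tokens.
input_ids, labels = build_labels([11, 12, 13], [21, 22])
loss_positions = positions_with_loss(labels)
```

With this kind of masking the forward pass still runs over every token, so it changes which positions produce gradients from the loss rather than shortening the sequence the model processes.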
1 vote
0 answers
59 views

Surrogate model to produce time series from parameter set

Say I have a model $M$ that takes in a parameter vector $\beta$, and produces a (numerical) time series. This could be a complicated model (e.g. a bespoke enzyme reaction model), or something simple ...
Mich55
2 votes
1 answer
126 views

How to generate a response while considering past questions as well?

User: What is the tallest mountain?
Agent: Everest
User: Where is it located?  # Agent hears: "Where is Everest located?"
Agent: Nepal

I want to be able ...
angryweasel
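A simple way to let a model resolve references such as "it" is to pass the earlier turns along with the new question. The sketch below is only an illustration under that assumption: `build_prompt` is a hypothetical helper, the `User:` / `Agent:` labels follow the example dialogue, and no particular model or library is implied.

```python
def build_prompt(history, user_message):
    """Join earlier (speaker, text) turns with the new user message into one prompt."""
    lines = [f"{speaker}: {text}" for speaker, text in history]
    lines.append(f"User: {user_message}")
    lines.append("Agent:")  # the model is expected to continue from here
    return "\n".join(lines)


# Earlier turns from the example dialogue.
history = [
    ("User", "What is the tallest mountain?"),
    ("Agent", "Everest"),
]
prompt = build_prompt(history, "Where is it located?")
print(prompt)
```

Because the previous answer ("Everest") is part of the prompt, a model reading it has the context needed to interpret the follow-up question.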
0 votes
0 answers
888 views

How does the autoregressive attention mechanism work in multi-head attention?

[LONG POST!!] I am working on a DNN model that works as an improviser to generate music sequences. The idea of generating music is based on taking a sequence of music nodes (their index representation)...
Sami
0 votes
1 answer
98 views

Sentiment analysis does not handle neutrals [closed]

I'm writing some financial tools. I've found high-performing models for question answering, but when it comes to sentiment analysis I haven't found anything that good. I'm trying to use ...
johnny 5
1 vote
1 answer
102 views

How to implement or avoid masking for transformer?

When it comes to using Transformers for image captioning, is there any reason to use masking? I currently have a ResNet-101 encoder and am trying to use the features as the input for a transformer model ...
Gibbo0789
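In captioning setups the image features are usually left fully visible, while the caption tokens in the decoder are masked causally during training so that a position cannot look at later words. A minimal sketch of such a causal mask in plain Python follows; `causal_mask` is a hypothetical helper, and the convention that `True` means "may attend" is an assumption (libraries differ on whether `True` marks allowed or blocked positions).

```python
def causal_mask(size):
    """Build a size x size mask; entry [i][j] is True when position i may attend to j."""
    return [[j <= i for j in range(size)] for i in range(size)]


# Mask for a caption of three tokens: each row can see itself and earlier tokens only.
mask = causal_mask(3)
for row in mask:
    print(row)
```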
